ARITHMETIC SKILLS IN USING ALGORITHMS


An  algorithm  is  a  series  of  steps  or  operations  that,  when 
sequentially  applied,  produces  a  solution  to  a  problem.  Properly 
applied,  algorithms  are  helpful  when  a  complex  or  difficult  numerical 
question  can  be  decomposed  into  sub-questions.  Answers  are  given  to 
the  sub-questions;  these  components  are  then  recomposed,  via  the 
algorithm,  to  arrive  at  an  answer  to  the  original,  target  question. 

In  a  series  of  experiments,  we  have  been  exploring  the  techniques 
of  algorithmic  decomposition  as  an  aid  to  numerical  estimation 
(MacGregor, Lichtenstein, & Slovic, in press; Lichtenstein & MacGregor,
1984; Lichtenstein, MacGregor & Slovic, 1987; Lichtenstein & Weathers,
1987).  Although  the  techniques  lead  to  improvements,  some  subjects  are 
led  seriously  astray  by  the  very  methods  that  are  intended  to  help 
them. 

We  have  previously  reported  on  one  source  of  subjects'  poor 
performance:  misinformation.  We  asked  subjects  to  estimate  apparently 
obscure  numerical  facts  (such  as  the  number  of  pounds  of  potato  chips 
consumed  yearly  in  the  U.S.)  by  decomposing  each  question  into  a  series 
of  questions  the  answers  to  which  are  easier  to  estimate  (e.g.,  pounds 
of  potato  chips  consumed  per  capita  per  week,  number  of  weeks  in  a 
year,  and  population  of  the  U.S.).  The  success  of  such  an  approach 
relies  in  part  on  the  subjects'  knowledge  of  these  easier  elements. 

But we found substantial amounts of misinformation (Lichtenstein,
1987). For example, only 31% of our subjects knew how many feet there
are in a mile; only 57% estimated the population of the United States with
an  eight-number  digit.  Seriously  erroneous  beliefs  (e.g.,  that  the  U.S. 
population is three billion) can doom the effectiveness of algorithmic
decomposition. 

The  present  paper  reports  on  another  source  of  problems  in  using 
algorithms:  weakness  in  arithmetic  skills.  In  order  to  exclude  the 
problem  of  misinformation,  we  focus  here  on  an  algorithm  that  requires 
no  estimation  skills.  This  algorithm  is  based  on  the  use  of  Bayes' 
Theorem  to  solve  a  class  of  problems  in  combining  probabilistic 
evidence;  these  problems  have  been  called  base-rate  problems  (see, 
e.g., Bar-Hillel, 1980). The two problems we used are shown in Table
1.  The  problems  have  different  cover  stories  but  are  structurally  the 
same. 
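For reference, the computation that both problems call for can be written out with Bayes' Theorem. The symbols H (the hypothesis, e.g., that the bulb is defective) and E (the evidence, e.g., that the scanner marks the bulb) are notation introduced here for illustration and are not the authors'; the numbers are those stated in the problems of Table 1.

```latex
\[
P(H \mid E) = \frac{P(E \mid H)\,P(H)}
                   {P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)}
\]

% Lightbulb problem: P(H) = .15, P(E | H) = .80, P(E | not H) = .20
\[
\frac{(.80)(.15)}{(.80)(.15) + (.20)(.85)} = \frac{.12}{.29} \approx .41
\]

% Dyslexia problem: P(H) = .02, P(E | H) = .95, P(E | not H) = .05
\[
\frac{(.95)(.02)}{(.95)(.02) + (.05)(.98)} = \frac{.019}{.068} \approx .28
\]
```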


Insert  Table  1  about  here 


Subjects. The subjects were 76 paid volunteers who responded to
ads in the University of Oregon student newspaper. The present task
was completed along with several other unrelated paper-and-pencil tasks
in a one- to two-hour period. The subjects were run in groups in a
large university classroom. Each subject received one of the two
problems shown in Table 1.

Instructions  and  Algorithm.  The  instructions  said: 


In this task we would like you to work through a problem
by carefully following a number of detailed steps. First,
you will read through the problem. Then, you will follow a
series of steps, some that ask you to pull information
directly from the problem itself, and others that ask you to

Table  1 


The  Base  Rate  Problems 


Light  Bulb 


Consider  the  following  problem: 

A light bulb factory uses a scanning device which is supposed
to put a mark on each defective bulb it spots in the assembly line.
Eighty-five percent (85%) of the light bulbs on the line are OK;
the remaining 15% are defective.

The scanning device is known to be accurate in 80% of the
decisions, regardless of whether the bulb is actually OK or
actually defective. That is, when a bulb is good, the scanner
correctly identifies it as good 80% of the time. When a bulb is
defective, the scanner correctly marks it as defective 80% of the
time.

Suppose  someone  selects  one  of  the  light  bulbs  from  the  line 
at  random  and  gives  it  to  the  scanner.  The  scanner  marks  this  bulb 
as  defective. 

What  is  the  probability  that  this  bulb  is  really  defective? 


(table  continues) 
Table 1 (continued)


Dyslexia 


Dyslexia is a disorder characterized by an impaired ability to
read. Two percent (2%) of all first graders have dyslexia. A
screening test for dyslexia has recently been devised that can be
used with first graders. The screening test is cheap and easy to
administer; it identifies those children who will later be given a
more extensive test to determine for sure whether the child has
dyslexia. The screening test is not completely accurate. For
children who really have dyslexia, the screening test is positive
(indicating dyslexia) 95% of the time. But it also gives a
positive (dyslexia) result for 5% of the normal children, the ones
who do not have dyslexia.

A  first  grader  is  given  the  screening  test  and  the  result  is 
positive,  indicating  dyslexia. 

What  is  the  probability  that  the  child  really  has  dyslexia? 


carry  out  basic  arithmetic.  Please  follow  all  the  directions 
carefully.  Pay  special  attention  to  the  accuracy  of  your 
arithmetic.  This  is  not  a  test  of  your  ability  to  do 
arithmetic,  but  accuracy  of  computation  is  essential  to  what 
we  are  asking  you  to  do. 

[The  problem  followed.] 

After the problem was an algorithm composed of thirteen steps, as shown
for the Lightbulb problem in Table 2. On the page following the
algorithm, two additional questions were asked:

Do you think the answer in (M) is a sensible answer to the
question, "What is the probability that this lightbulb is
really defective [the child really has dyslexia]?"

Yes  _  No  _ 

If  you  answered  No,  what  do  you  think  is  a  sensible 
answer? 


Insert  Table  2  about  here 
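
The thirteen steps of the Table 2 algorithm can be sketched in a few lines of code. This is only an illustrative sketch: the function name `run_table2_algorithm` and its parameter names are hypothetical, and step (A) is assumed to ask for the number of defective items out of 1,000, as implied by the wording of step (B). Steps (E), (F), and (G) involve only reading and copying, so they have no arithmetic counterpart here.

```python
def run_table2_algorithm(defective_per_1000, pct_c, pct_d, total=1000):
    """Carry out the arithmetic of steps (A)-(M) and return each value."""
    a = defective_per_1000      # (A) assumed: defective items out of 1,000
    b = total - a               # (B) items that are NOT defective
    c = pct_c / 100.0           # (C) percentage converted to a decimal
    d = pct_d / 100.0           # (D) percentage converted to a decimal
    box1 = a * c                # (H) defective, flagged as defective
    box2 = a - box1             # (I) defective, not flagged
    box3 = b * d                # (J) not defective, not flagged
    box4 = b - box3             # (K) not defective, flagged as defective
    flagged = box1 + box4       # (L) everything flagged as defective
    answer = box1 / flagged     # (M) final answer
    return {"A": a, "B": b, "H": box1, "I": box2,
            "J": box3, "K": box4, "L": flagged, "M": answer}


lightbulb = run_table2_algorithm(150, 80, 80)   # values from the Lightbulb story
dyslexia = run_table2_algorithm(20, 95, 95)     # values from the Dyslexia story
print(round(lightbulb["M"], 2), round(dyslexia["M"], 2))
```

Under these assumptions the sketch reproduces the two-digit answers of .41 and .28. Entering 5 rather than 95 for (D) in the Dyslexia problem would, by the same arithmetic, give 19/950, or about .02.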


Results. The correct answer (to two-digit accuracy) for the
Lightbulb problem is .41; for the Dyslexia problem, .28. Only 17 of
the 76 subjects (22%) gave the correct answer. Table 3 shows the
answers the subjects gave, categorized according to ranges around the
correct answer:

Too Low: Answers falling more than .10 below the correct answer.

About Right: Answers within ±.10 of the correct answer, including
all correct answers.
Table 2


Algorithm for the Lightbulb Problem


(B) Subtract your estimate in (A) from 1,000 to get the number of
bulbs out of 1,000 that are NOT defective.

1,000 - (A) ____ = ____ (B)

(C) What percentage of the time is the scanner
able to correctly identify light bulbs that
are actually defective? (from the problem) ____ (C)

(D) What percentage of the time is the scanner
able to correctly identify light bulbs that are
actually not defective? (from the problem) ____ (D)


(table  continues) 


Table  2  (continued) 


(E) Look over the following table:

                              LIGHT BULBS ARE:

                     Actually Defective      Not Defective

Scanner
Says IS
Defective             Box #1 ____     +     Box #4 ____     = ____ (L)

Scanner
Says IS NOT
Defective             Box #2 ____           Box #3 ____

                         ____ (A)              ____ (B)


(F) Write the number of defective light bulbs from (A) on the line
labeled (A) in the table above, just below Box #2.

(G) Write the number of non-defective light bulbs from (B) on the
line labeled (B) in the table above, just below Box #3.

(H) Multiply the percentage value in (C) by your estimate from
(A). (First convert the percentage value to a decimal value
before multiplying.)

(A) ____ x (C) ____ = ____ (H)

Write your value for (H) in Box #1.

(table continues)
Table 2 (continued)


(I) Subtract your value in (H) from your value in (A).

(A) ____ - (H) ____ = ____ (I)

Write your value for (I) in Box #2.

(J) Multiply the percentage value in (D) by your estimate from
(B). (First convert the percentage value to a decimal value
before multiplying.)

(B) ____ x (D) ____ = ____ (J)

Write your value for (J) in Box #3.

(K) Subtract your value in (J) from your value in (B).

(B) ____ - (J) ____ = ____ (K)

Write your value for (K) in Box #4.

(L) Add the numbers in Boxes #1 and #4.

Box #1 ____ + Box #4 ____ = ____ (L)

Write your value for (L) on the line labeled (L), to the right
of the boxes.

(M) To get the final answer, divide your value in Box #1 by your
value for (L).

Box #1 ____ ÷ (L) ____ = ____ (M)
Too High: Answers that are more than .10 above the correct answer
but below 1.00 (no subject got an answer of exactly 1.00).

Outside:  Negative  answers  and  answers  greater  than  1.00. 

None:  No  numerical  answer  given. 
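
The five categories above amount to a simple decision rule, which might be sketched as follows. The function name `categorize` and its parameters are hypothetical. The treatment of an answer of exactly 1.00 is an assumption, since the definitions leave that case unassigned and no subject produced it.

```python
def categorize(answer, correct, tolerance=0.10):
    """Assign an answer to one of the five response categories."""
    if answer is None:
        return "None"            # no numerical answer given
    if answer < 0 or answer > 1.00:
        return "Outside"         # not a permissible probability
    if answer < correct - tolerance:
        return "Too Low"
    if answer > correct + tolerance:
        # Assumption: an answer of exactly 1.00 would also land here.
        return "Too High"
    return "About Right"         # within the tolerance, correct answers included


for value in (0.41, 0.35, 0.80, 4934.4, None):
    print(value, categorize(value, correct=0.41))
```

The values .35 and 4934.4 are answers mentioned in the text; .80 is an arbitrary illustrative value.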


Insert  Table  3  about  here 


Is your answer sensible? As shown in Table 3, most subjects
thought that their answers were sensible. Base-rate problems are
notorious for having nonintuitive answers, so it is perhaps not
surprising that subjects who arrived at about the right answer were
less likely to think their answer was sensible (55%) than subjects
who arrived at answers that were within the range but too high or
too low (77%). Most discouraging is that more than half the subjects
whose answers fell outside the bounds of 0 to 1 were satisfied with
their answers.

Of  the  26  subjects  who  said  their  answer  was  not  sensible,  only  20 
gave  a  new  answer.  Only  one  of  these  revised  answers  was  close  to 
being  correct;  this  subject  had  completed  the  algorithm  perfectly, 
arriving  at  the  answer  of  .41  to  the  light  bulb  problem,  but  said  that  a 
sensible  answer  was  .35.  Twelve  of  the  20  revised  answers  fell  in  the 
Too  Low  category,  supporting  the  finding  (shown  in  Table  3)  that  most 
(82%)  of  the  subjects  who  had  originally  calculated  a  low  answer  found 
it  sensible. 

Errors.  The  subjects  made  numerous  errors  in  the  task.  These 
errors,  categorized  and  tallied,  are  shown  in  Table  4. 
Table 3. Frequency of responses


                  Frequency       Frequency of Sensibleness
                  of Answer       Yes       No       Blank

Too Low               18           14        3         1
About Right           29           16       13         -
Too High               9            6        3         -
Outside               17           10        7         -
None                   3            -        -         3

Total                 76           46       26         4
Insert Table 4 about here


The  Dyslexia  group  were  particularly  prone  to  the  error  of  taking 
the  wrong  information  from  the  story.  The  Dyslexia  story  differs  from 
the  Lightbulb  story  by  expressing  the  two  pieces  of  diagnostic 
information  in  two  different  ways: 

. . . For children who really have dyslexia, the screening
test is positive (indicating dyslexia) 95% of the time. But
it also gives a positive (dyslexia) result for 5% of the
normal children, the ones who do not have dyslexia.

Many subjects were apparently confused by this wording, so that when
the algorithm asked, "What percentage of the time is the screening test
able to correctly identify children that actually do not have
dyslexia?" (emphasis in the original), 45% of the subjects filled in
5%. One subject even went so far as to write us a note in the margin
saying that this question was incorrectly worded.

The  algorithm  several  times  required  subjects  to  copy  a  previous 
calculation  into  a  new  spot.  About  a  third  of  the  subjects  made  errors 
in  following  these  simple  directions. 

We  categorized  arithmetic  errors  as  (a)  errors  in  sign,  (b) 
addition  or  subtraction,  or  (c)  multiplication  or  division.  Within  the 
multiplication  or  division  errors  we  further  distinguished  decimal 
errors,  upside-down  division,  and  other  errors.  Upside-down  division 




Table 4


Error Analysis, in Percentages


                                     All      Lightbulb    Dyslexia
                                    (n=76)     (n=29)       (n=47)

Wrong Info from Story                 47         28           60
One or More Copying Error             32         28           34
One or More Arithmetic Error          54         34           66
  Sign Error                           8          7            9
  Addition or Subtraction             13         14           13
  Multiplication or Division          51         31           64
    Decimal Error                     22         17           26
    Upside-down Division              13          3           19
    Other                             30         21           36
Incomplete Algorithm                   4          3            4
No Errors                             22         41           11

Mean No. Errors per Subject         2.21, 2.89 [remaining entry illegible]
Most Errors by One Subject          [illegible]
is the calculation of the inverse of the indicated division, for
example:

50 ÷ 2 = .04 or 2 ÷ 50 = 25.

The Dyslexia subjects showed a significantly greater frequency of one
or more arithmetic errors, χ² = 7.03, p < .01, a finding we cannot
explain.
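
As a rough check on the reported statistic, the tail probability of a chi-square value of 7.03 can be computed directly. This sketch assumes one degree of freedom (a two-group comparison on a yes/no outcome), which the text does not state. The helper name `chi2_tail_1df` is hypothetical. It relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable.

```python
import math


def chi2_tail_1df(x):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    # P(Z**2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    return math.erfc(math.sqrt(x / 2.0))


p_value = chi2_tail_1df(7.03)
print(f"p = {p_value:.4f}")
```

Under that assumption the tail probability falls below .01, which is consistent with describing the difference as significant.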

An  Incomplete  algorithm  received  only  one  tally  for  that  reason, 
regardless  of  how  many  steps  were  omitted. 

The errors made by the Algorithm subjects sometimes led to absurd
answers; as shown in Table 3, 22% of the answers were outside the range
of permissible probabilities, either negative or greater than 1.00.

The  largest  answer  was  4934.4;  this  subject  made  three  decimal  errors, 
one  copying  error,  one  multiplication  error,  one  sign  error,  and  ended 
with  an  upside-down  division. 

Additional  Data.  An  additional  group  of  102  subjects  were  given 
the  same  problems  with  a  different  aid.  Before  reading  the  problem, 
these  subjects  read  a  lengthy  (six  single-spaced  pages)  tutorial 
designed  to  teach  the  subjects  how  to  solve  base-rate  problems.  The 
method  presented  was  based  on  the  2-by-2  table  that  formed  the  center 
of  the  algorithm,  but  in  contrast  with  the  algorithm,  the  tutorial 
emphasized  understanding  and  common  sense  (for  further  details,  see 
Lichtenstein  &  MacGregor,  1984).  After  the  tutorial  each  subject 
received  one  of  the  two  problems  with  a  worksheet.  The  Lightbulb 
version  of  this  worksheet  is  shown  in  Table  5.  As  may  be  seen,  it  is 
shorter  and  requires  the  subjects  to  make  judgments. 

Of  these  102  subjects,  35  received  the  task  in  the  usual  large 
Table 5


Worksheet for the Tutorial Group


Please work the following problem using the method just described.

We've drawn you a table to work with.

A light bulb factory uses a scanning device which is supposed to put
a mark on each defective bulb it spots in the assembly line. Eighty-five
percent (85%) of the light bulbs on the line are OK; the remaining
15% are defective.

The scanning device is known to be accurate in 80% of the decisions,
regardless of whether the bulb is actually OK or actually defective.

That is, when a bulb is good, the scanner correctly identifies it as
good 80% of the time. When a bulb is defective, the scanner correctly
marks it as defective 80% of the time.

Suppose someone selects one of the light bulbs from the line at
random and gives it to the scanner. The scanner marks this bulb as
defective.

What is the probability that this bulb is really defective?


Step  1.  Draw  a  table.  Done. 

Step  2.  Label  the  table. 

Step  3.  Assign  an  arbitrary  grand  total.  Use  1,000. 

Step 4. Estimate the population totals. First decide which set of
information is population information. Then divide the 1,000 into two
parts, using information from the problem.

Step 5. Fill in the cells. Divide each of your estimated totals among
its two cells, according to the information in the problem.

Step 6. Cross out the false. Cross out the two cells that are
contradicted by the information given in the problem.

Step 7. Find the needed probability. Write the relevant numbers in the
top and bottom of the fraction and convert the fraction to a decimal
answer.


      # in target cell
  ------------------------  =  ____  answer
  Sum of #'s in both cells
classroom groups. The other 67 were run in small groups of 4 to 7
people,  with  fewer  other  tasks  and  with  small,  battery-operated 
calculators  available  for  use. 

Thirty-one  percent  of  these  subjects  arrived  at  the  right  answer, 
the  same  percentage  among  those  given  the  task  in  a  large  classroom  and 
among  those  who  were  run  in  small  groups. 

The format of the worksheet for these subjects did not permit as
detailed an analysis of errors. Arithmetic errors were found for 43%
of the large-group subjects (not much less than the 54% arithmetic
error rate for the Algorithm subjects) and 18% for the small-group
subjects (who were encouraged but not required to use calculators).

Discussion 

This paper has identified and detailed a serious barrier to the
effective use of algorithms: weak mathematical skills. Given an
algorithm of 13 steps requiring copying, converting from percentages to
proportions, adding, subtracting, multiplying, and dividing, 78% of our
subjects made one or more errors in the task.

One  should  not  generalize  these  results  to  the  U.S.  population  at 
large.  These  subjects  were,  with  few,  if  any,  exceptions,  college 
students  at  a  state  university.  As  such,  they  are  above  average  in 
intelligence  and  education.  But  they  may  be  reasonably  representative 
of  many  groups  for  whom  decision  aids  are  designed,  such  as  business 
people,  government  employees,  and  military  personnel.  Our  results, 
therefore,  should  be  taken  to  heart  by  all  those  who  design  decision 
aids.  The  problems  such  designers  should  face  are  exacerbated  by  our 
previous findings that this same population of subjects often hold
erroneous  knowledge  of  ordinary  facts  (Lichtenstein,  1987). 

In  this  age  of  $5  electronic  calculators  and  $500  computers,  lack 
of  arithmetic  skills  might  seem  unimportant.  But  some  of  the  errors 
our  subjects  made  are  unlikely  to  be  cured  by  the  availability  of  such 
tools. An "upside-down" division (e.g., 2/50 = 25) can be performed
easily on a calculator. And given the sentence "[The screening test]
gives a positive (dyslexia) result for 5% of the normal children, the
ones who do not have dyslexia," 45% of our subjects answered "5%,"
instead of "95%," to the question, "What percentage of the time is the
screening test able to correctly identify children that actually do not
have dyslexia?" Electronic calculators will be of no help for such
misunderstandings.

Great  care  should  be  taken,  we  conclude,  that  decision  aids  be 
tested  on  the  population  for  which  they  are  intended,  to  avoid  as  much 
as  possible  problems  arising  from  unexpected  deficits  in  users' 
knowledge  and  skills. 